Overview

Dataset statistics

Number of variables: 16
Number of observations: 6362620
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 564.3 MiB
Average record size in memory: 93.0 B
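
The average record size follows from the total memory footprint and the row count. A minimal sketch of that arithmetic, assuming MiB means 1024 × 1024 bytes; `average_record_size` is a hypothetical helper name used only for illustration:

```python
MIB = 1024 ** 2  # bytes per mebibyte (assumed unit of the reported total)


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Return the mean number of bytes per row."""
    return total_mib * MIB / n_rows


# Figures taken from the dataset statistics above.
size = average_record_size(564.3, 6_362_620)
print(f"{size:.1f} B")
```

With the reported figures this comes out at roughly 93 bytes per row, in line with the value shown above.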

Variable types

Numeric: 7
Categorical: 9

Alerts

nameOrig has a high cardinality: 6353307 distinct values [High cardinality]
nameDest has a high cardinality: 2722362 distinct values [High cardinality]
amount is highly overall correlated with oldbalanceDest and 1 other field [High correlation]
oldbalanceOrig is highly overall correlated with newbalanceOrig [High correlation]
newbalanceOrig is highly overall correlated with oldbalanceOrig [High correlation]
oldbalanceDest is highly overall correlated with amount and 1 other field [High correlation]
newbalanceDest is highly overall correlated with amount and 1 other field [High correlation]
CASH_OUT is highly overall correlated with PAYMENT [High correlation]
PAYMENT is highly overall correlated with CASH_OUT [High correlation]
isFraud is highly imbalanced (98.6%) [Imbalance]
isFlaggedFraud is highly imbalanced (> 99.9%) [Imbalance]
DEBIT is highly imbalanced (94.3%) [Imbalance]
TRANSFER is highly imbalanced (58.5%) [Imbalance]
amount is highly skewed (γ1 = 30.99394948) [Skewed]
nameOrig is uniformly distributed [Uniform]
oldbalanceOrig has 2102449 (33.0%) zeros [Zeros]
newbalanceOrig has 3609566 (56.7%) zeros [Zeros]
oldbalanceDest has 2704388 (42.5%) zeros [Zeros]
newbalanceDest has 2439433 (38.3%) zeros [Zeros]
day has 571039 (9.0%) zeros [Zeros]
hour has 71587 (1.1%) zeros [Zeros]
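
The imbalance percentages in these alerts can be reproduced from the value counts given in the per-variable sections. The sketch below assumes the score is one minus the Shannon entropy of the class distribution, normalized by the maximum entropy for that number of classes. This definition is inferred from the reported figures, not stated in the report, and `imbalance_score` is a hypothetical helper name.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / H_max for a list of class counts (assumed definition)."""
    if len(counts) < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1.0 - entropy / math.log2(len(counts))


# Value counts (0, 1) copied from the variable sections of this report.
value_counts = {
    "isFraud": [6354407, 8213],
    "isFlaggedFraud": [6362604, 16],
    "DEBIT": [6321188, 41432],
    "TRANSFER": [5829711, 532909],
}
for name, counts in value_counts.items():
    print(f"{name}: {imbalance_score(counts):.1%}")
```

Under this assumption the scores for isFraud, DEBIT, and TRANSFER round to the 98.6%, 94.3%, and 58.5% shown in the alerts, and isFlaggedFraud exceeds 99.9%.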

Reproduction

Analysis started: 2023-04-25 14:11:35.881651
Analysis finished: 2023-04-25 14:26:38.076629
Duration: 15 minutes and 2.19 seconds
Software version: ydata-profiling v4.0.0
Download configuration: config.json
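
As a quick consistency check, the duration can be recomputed from the two timestamps. This is only an illustrative sketch using the standard library; the format string is an assumption about how the timestamps are written.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed timestamp layout

started = datetime.strptime("2023-04-25 14:11:35.881651", FMT)
finished = datetime.strptime("2023-04-25 14:26:38.076629", FMT)

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```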

Variables

amount
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 5316900
Distinct (%): 83.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 179861.9
Minimum: 0
Maximum: 92445517
Zeros: 16
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 2224.0995
Q1: 13389.57
median: 74871.94
Q3: 208721.48
95-th percentile: 518634.2
Maximum: 92445517
Range: 92445517
Interquartile range (IQR): 195331.91

Descriptive statistics

Standard deviation: 603858.23
Coefficient of variation (CV): 3.3573437
Kurtosis: 1797.9567
Mean: 179861.9
Median Absolute Deviation (MAD): 68393.655
Skewness: 30.993949
Sum: 1.1443929 × 10^12
Variance: 3.6464476 × 10^11
Monotonicity: Not monotonic
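
Several of the figures above are derived from one another. The sketch below recomputes them from the reported mean, standard deviation, and quartiles, assuming the usual definitions (CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1). The variable names are illustrative only.

```python
# Values copied from the amount statistics in this report.
mean = 179861.9
std = 603858.23
q1, q3 = 13389.57, 208721.48
minimum, maximum = 0.0, 92445517.0

cv = std / mean                  # coefficient of variation
variance = std ** 2              # squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # spread between the extremes

print(f"CV       = {cv:.4f}")
print(f"Variance = {variance:.4e}")
print(f"IQR      = {iqr:.2f}")
print(f"Range    = {value_range:.0f}")
```

Small differences in the last digits are expected because the inputs are themselves rounded in the report.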
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
10000000 | 3207 | 0.1%
10000 | 88 | < 0.1%
5000 | 79 | < 0.1%
15000 | 68 | < 0.1%
500 | 65 | < 0.1%
100000 | 42 | < 0.1%
21500 | 37 | < 0.1%
120000 | 29 | < 0.1%
135000 | 20 | < 0.1%
0 | 16 | < 0.1%
Other values (5316890) | 6358969 | 99.9%

Minimum 10 values

Value | Count | Frequency (%)
0 | 16 | < 0.1%
0.01 | 1 | < 0.1%
0.02 | 3 | < 0.1%
0.03 | 2 | < 0.1%
0.04 | 1 | < 0.1%
0.06 | 1 | < 0.1%
0.07 | 1 | < 0.1%
0.09 | 1 | < 0.1%
0.1 | 1 | < 0.1%
0.11 | 2 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
92445516.64 | 1 | < 0.1%
73823490.36 | 1 | < 0.1%
71172480.42 | 1 | < 0.1%
69886731.3 | 1 | < 0.1%
69337316.27 | 1 | < 0.1%
67500761.29 | 1 | < 0.1%
66761272.21 | 1 | < 0.1%
64234448.19 | 1 | < 0.1%
63847992.58 | 1 | < 0.1%
63294839.63 | 1 | < 0.1%

nameOrig
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 6353307
Distinct (%): 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

Value | Count
C1784010646 | 3
C363736674 | 3
C2051359467 | 3
C1902386530 | 3
C1976208114 | 3
Other values (6353302) | 6362605

Length

Max length: 11
Median length: 11
Mean length: 10.482323
Min length: 5

Characters and Unicode

Total characters: 66695040
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6344009
Unique (%): 99.7%
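
"Distinct" and "Unique" are different counts: the first is the number of different values, the second the number of values that occur exactly once. A minimal sketch of the distinction on a toy list; the IDs are made up for illustration and `distinct_and_unique` is a hypothetical helper:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Toy account IDs (illustrative, not taken from the dataset).
sample = ["C1", "C2", "C2", "C3", "C3", "C3", "C4"]
print(distinct_and_unique(sample))
```

Applied to the figures above, 6353307 - 6344009 = 9298 of the nameOrig values appear more than once.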

Sample

1st row: C1231006815
2nd row: C1666544295
3rd row: C1305486145
4th row: C840083671
5th row: C2048537720

Common Values

Value | Count | Frequency (%)
C1784010646 | 3 | < 0.1%
C363736674 | 3 | < 0.1%
C2051359467 | 3 | < 0.1%
C1902386530 | 3 | < 0.1%
C1976208114 | 3 | < 0.1%
C1462946854 | 3 | < 0.1%
C400299098 | 3 | < 0.1%
C1999539787 | 3 | < 0.1%
C1530544995 | 3 | < 0.1%
C1677795071 | 3 | < 0.1%
Other values (6353297) | 6362590 | > 99.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
c1784010646 | 3 | < 0.1%
c1530544995 | 3 | < 0.1%
c363736674 | 3 | < 0.1%
c2098525306 | 3 | < 0.1%
c1832548028 | 3 | < 0.1%
c1065307291 | 3 | < 0.1%
c724452879 | 3 | < 0.1%
c1677795071 | 3 | < 0.1%
c545315117 | 3 | < 0.1%
c1999539787 | 3 | < 0.1%
Other values (6353297) | 6362590 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 8803448 | 13.2%
C | 6362620 | 9.5%
2 | 6136135 | 9.2%
3 | 5699596 | 8.5%
4 | 5693146 | 8.5%
7 | 5669437 | 8.5%
5 | 5668010 | 8.5%
6 | 5667725 | 8.5%
0 | 5667074 | 8.5%
9 | 5665212 | 8.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 60332420 | 90.5%
Uppercase Letter | 6362620 | 9.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 8803448 | 14.6%
2 | 6136135 | 10.2%
3 | 5699596 | 9.4%
4 | 5693146 | 9.4%
7 | 5669437 | 9.4%
5 | 5668010 | 9.4%
6 | 5667725 | 9.4%
0 | 5667074 | 9.4%
9 | 5665212 | 9.4%
8 | 5662637 | 9.4%
Uppercase Letter
Value | Count | Frequency (%)
C | 6362620 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 60332420 | 90.5%
Latin | 6362620 | 9.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 8803448 | 14.6%
2 | 6136135 | 10.2%
3 | 5699596 | 9.4%
4 | 5693146 | 9.4%
7 | 5669437 | 9.4%
5 | 5668010 | 9.4%
6 | 5667725 | 9.4%
0 | 5667074 | 9.4%
9 | 5665212 | 9.4%
8 | 5662637 | 9.4%
Latin
Value | Count | Frequency (%)
C | 6362620 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 66695040 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 8803448 | 13.2%
C | 6362620 | 9.5%
2 | 6136135 | 9.2%
3 | 5699596 | 8.5%
4 | 5693146 | 8.5%
7 | 5669437 | 8.5%
5 | 5668010 | 8.5%
6 | 5667725 | 8.5%
0 | 5667074 | 8.5%
9 | 5665212 | 8.5%

oldbalanceOrig
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1845844
Distinct (%): 29.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 833883.1
Minimum: 0
Maximum: 59585040
Zeros: 2102449
Zeros (%): 33.0%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 14208
Q3: 107315.18
95-th percentile: 5823702.3
Maximum: 59585040
Range: 59585040
Interquartile range (IQR): 107315.18

Descriptive statistics

Standard deviation: 2888242.7
Coefficient of variation (CV): 3.4636062
Kurtosis: 32.964879
Mean: 833883.1
Median Absolute Deviation (MAD): 14208
Skewness: 5.2491364
Sum: 5.3056813 × 10^12
Variance: 8.3419457 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2102449 | 33.0%
184 | 918 | < 0.1%
133 | 914 | < 0.1%
195 | 912 | < 0.1%
164 | 909 | < 0.1%
109 | 908 | < 0.1%
181 | 908 | < 0.1%
157 | 902 | < 0.1%
146 | 899 | < 0.1%
136 | 898 | < 0.1%
Other values (1845834) | 4252003 | 66.8%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2102449 | 33.0%
0.05 | 1 | < 0.1%
0.18 | 1 | < 0.1%
0.21 | 1 | < 0.1%
0.44 | 1 | < 0.1%
0.67 | 1 | < 0.1%
1 | 370 | < 0.1%
1.02 | 1 | < 0.1%
1.37 | 1 | < 0.1%
1.38 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
59585040.37 | 1 | < 0.1%
57316255.05 | 1 | < 0.1%
50399045.08 | 1 | < 0.1%
49585040.37 | 1 | < 0.1%
47316255.05 | 1 | < 0.1%
45674547.89 | 1 | < 0.1%
44892193.09 | 1 | < 0.1%
43818855.3 | 1 | < 0.1%
43686616.33 | 1 | < 0.1%
42542664.27 | 1 | < 0.1%

newbalanceOrig
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 2682586
Distinct (%): 42.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 855113.67
Minimum: 0
Maximum: 49585040
Zeros: 3609566
Zeros (%): 56.7%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 144258.41
95-th percentile: 5980262.3
Maximum: 49585040
Range: 49585040
Interquartile range (IQR): 144258.41

Descriptive statistics

Standard deviation: 2924048.5
Coefficient of variation (CV): 3.4194852
Kurtosis: 32.066985
Mean: 855113.67
Median Absolute Deviation (MAD): 0
Skewness: 5.176884
Sum: 5.4407633 × 10^12
Variance: 8.5500596 × 10^12
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 3609566 | 56.7%
9011.73 | 4 | < 0.1%
7468.59 | 4 | < 0.1%
8927.38 | 4 | < 0.1%
4019.43 | 4 | < 0.1%
7717.83 | 4 | < 0.1%
36875.73 | 4 | < 0.1%
7070.1 | 4 | < 0.1%
10528.49 | 4 | < 0.1%
7802.01 | 4 | < 0.1%
Other values (2682576) | 2753018 | 43.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 3609566 | 56.7%
0.01 | 1 | < 0.1%
0.03 | 1 | < 0.1%
0.05 | 1 | < 0.1%
0.12 | 1 | < 0.1%
0.13 | 1 | < 0.1%
0.18 | 1 | < 0.1%
0.21 | 1 | < 0.1%
0.23 | 1 | < 0.1%
0.3 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
49585040.37 | 1 | < 0.1%
47316255.05 | 1 | < 0.1%
43686616.33 | 1 | < 0.1%
43673802.21 | 1 | < 0.1%
41690842.64 | 1 | < 0.1%
41432359.46 | 1 | < 0.1%
40399045.08 | 1 | < 0.1%
39585040.37 | 1 | < 0.1%
38946233.02 | 1 | < 0.1%
38939424.03 | 1 | < 0.1%

nameDest
Categorical

Distinct: 2722362
Distinct (%): 42.8%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

Value | Count
C1286084959 | 113
C985934102 | 109
C665576141 | 105
C2083562754 | 102
C1590550415 | 101
Other values (2722357) | 6362090

Length

Max length: 11
Median length: 11
Mean length: 10.481752
Min length: 2

Characters and Unicode

Total characters: 66691405
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2262704
Unique (%): 35.6%

Sample

1st row: M1979787155
2nd row: M2044282225
3rd row: C553264065
4th row: C38997010
5th row: M1230701703

Common Values

Value | Count | Frequency (%)
C1286084959 | 113 | < 0.1%
C985934102 | 109 | < 0.1%
C665576141 | 105 | < 0.1%
C2083562754 | 102 | < 0.1%
C1590550415 | 101 | < 0.1%
C248609774 | 101 | < 0.1%
C1789550256 | 99 | < 0.1%
C451111351 | 99 | < 0.1%
C1360767589 | 98 | < 0.1%
C1023714065 | 97 | < 0.1%
Other values (2722352) | 6361596 | > 99.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
c1286084959 | 113 | < 0.1%
c985934102 | 109 | < 0.1%
c665576141 | 105 | < 0.1%
c2083562754 | 102 | < 0.1%
c1590550415 | 101 | < 0.1%
c248609774 | 101 | < 0.1%
c1789550256 | 99 | < 0.1%
c451111351 | 99 | < 0.1%
c1360767589 | 98 | < 0.1%
c1023714065 | 97 | < 0.1%
Other values (2722352) | 6361596 | > 99.9%

Most occurring characters

Value | Count | Frequency (%)
1 | 8799996 | 13.2%
2 | 6133780 | 9.2%
3 | 5704404 | 8.6%
4 | 5691070 | 8.5%
8 | 5675627 | 8.5%
9 | 5668861 | 8.5%
7 | 5665128 | 8.5%
0 | 5664751 | 8.5%
6 | 5662897 | 8.5%
5 | 5662271 | 8.5%
Other values (2) | 6362620 | 9.5%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 60328785 | 90.5%
Uppercase Letter | 6362620 | 9.5%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 8799996 | 14.6%
2 | 6133780 | 10.2%
3 | 5704404 | 9.5%
4 | 5691070 | 9.4%
8 | 5675627 | 9.4%
9 | 5668861 | 9.4%
7 | 5665128 | 9.4%
0 | 5664751 | 9.4%
6 | 5662897 | 9.4%
5 | 5662271 | 9.4%
Uppercase Letter
Value | Count | Frequency (%)
C | 4211125 | 66.2%
M | 2151495 | 33.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 60328785 | 90.5%
Latin | 6362620 | 9.5%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 8799996 | 14.6%
2 | 6133780 | 10.2%
3 | 5704404 | 9.5%
4 | 5691070 | 9.4%
8 | 5675627 | 9.4%
9 | 5668861 | 9.4%
7 | 5665128 | 9.4%
0 | 5664751 | 9.4%
6 | 5662897 | 9.4%
5 | 5662271 | 9.4%
Latin
Value | Count | Frequency (%)
C | 4211125 | 66.2%
M | 2151495 | 33.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 66691405 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 8799996 | 13.2%
2 | 6133780 | 9.2%
3 | 5704404 | 8.6%
4 | 5691070 | 8.5%
8 | 5675627 | 8.5%
9 | 5668861 | 8.5%
7 | 5665128 | 8.5%
0 | 5664751 | 8.5%
6 | 5662897 | 8.5%
5 | 5662271 | 8.5%
Other values (2) | 6362620 | 9.5%

oldbalanceDest
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 3614697
Distinct (%): 56.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1100701.7
Minimum: 0
Maximum: 3.5601589 × 10^8
Zeros: 2704388
Zeros (%): 42.5%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 132705.66
Q3: 943036.71
95-th percentile: 5147229.7
Maximum: 3.5601589 × 10^8
Range: 3.5601589 × 10^8
Interquartile range (IQR): 943036.71

Descriptive statistics

Standard deviation: 3399180.1
Coefficient of variation (CV): 3.0881938
Kurtosis: 948.67413
Mean: 1100701.7
Median Absolute Deviation (MAD): 132705.66
Skewness: 19.921758
Sum: 7.0033464 × 10^12
Variance: 1.1554425 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2704388 | 42.5%
10000000 | 615 | < 0.1%
20000000 | 219 | < 0.1%
30000000 | 86 | < 0.1%
40000000 | 31 | < 0.1%
102 | 21 | < 0.1%
198 | 19 | < 0.1%
125 | 18 | < 0.1%
132 | 18 | < 0.1%
160 | 18 | < 0.1%
Other values (3614687) | 3657187 | 57.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2704388 | 42.5%
0.01 | 1 | < 0.1%
0.03 | 1 | < 0.1%
0.13 | 1 | < 0.1%
0.33 | 1 | < 0.1%
0.37 | 1 | < 0.1%
0.79 | 1 | < 0.1%
1 | 7 | < 0.1%
1.39 | 1 | < 0.1%
1.64 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
356015889.4 | 1 | < 0.1%
355553416.3 | 1 | < 0.1%
355381433.6 | 1 | < 0.1%
355380483.5 | 1 | < 0.1%
355185537.1 | 1 | < 0.1%
328194464.9 | 1 | < 0.1%
327998074.2 | 1 | < 0.1%
327963024 | 1 | < 0.1%
327852121.4 | 1 | < 0.1%
327827763.4 | 1 | < 0.1%

newbalanceDest
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 3555499
Distinct (%): 55.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1224996.4
Minimum: 0
Maximum: 3.5617928 × 10^8
Zeros: 2439433
Zeros (%): 38.3%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 214661.44
Q3: 1111909.2
95-th percentile: 5515715.9
Maximum: 3.5617928 × 10^8
Range: 3.5617928 × 10^8
Interquartile range (IQR): 1111909.2

Descriptive statistics

Standard deviation: 3674128.9
Coefficient of variation (CV): 2.9992978
Kurtosis: 862.15651
Mean: 1224996.4
Median Absolute Deviation (MAD): 214661.44
Skewness: 19.352302
Sum: 7.7941866 × 10^12
Variance: 1.3499223 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value | Count | Frequency (%)
0 | 2439433 | 38.3%
10000000 | 53 | < 0.1%
971418.91 | 32 | < 0.1%
19169204.93 | 29 | < 0.1%
16532032.16 | 25 | < 0.1%
1254956.07 | 25 | < 0.1%
1412484.09 | 22 | < 0.1%
1178808.14 | 21 | < 0.1%
4743010.67 | 21 | < 0.1%
7364724.84 | 21 | < 0.1%
Other values (3555489) | 3922938 | 61.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2439433 | 38.3%
0.01 | 1 | < 0.1%
0.33 | 1 | < 0.1%
1.39 | 1 | < 0.1%
1.64 | 1 | < 0.1%
1.74 | 1 | < 0.1%
2.15 | 1 | < 0.1%
2.45 | 1 | < 0.1%
2.71 | 1 | < 0.1%
2.76 | 1 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
356179278.9 | 1 | < 0.1%
356015889.4 | 1 | < 0.1%
355553416.3 | 2 | < 0.1%
355381433.6 | 1 | < 0.1%
355380483.5 | 1 | < 0.1%
355185537.1 | 1 | < 0.1%
328431698.2 | 1 | < 0.1%
328194464.9 | 1 | < 0.1%
327998074.2 | 1 | < 0.1%
327963024 | 1 | < 0.1%

isFraud
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

Value | Count
0 | 6354407
1 | 8213

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 6354407 | 99.9%
1 | 8213 | 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 6354407 | 99.9%
1 | 8213 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 6354407 | 99.9%
1 | 8213 | 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6362620 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 6354407 | 99.9%
1 | 8213 | 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6362620 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 6354407 | 99.9%
1 | 8213 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6362620 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 6354407 | 99.9%
1 | 8213 | 0.1%

isFlaggedFraud
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

Value | Count
0 | 6362604
1 | 16

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 6362604 | > 99.9%
1 | 16 | < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 6362604 | > 99.9%
1 | 16 | < 0.1%

Most occurring characters

Value | Count | Frequency (%)
0 | 6362604 | > 99.9%
1 | 16 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6362620 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 6362604 | > 99.9%
1 | 16 | < 0.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6362620 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 6362604 | > 99.9%
1 | 16 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6362620 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 6362604 | > 99.9%
1 | 16 | < 0.1%

day
Real number (ℝ)

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.503158
Minimum: 0
Maximum: 30
Zeros: 571039
Zeros (%): 9.0%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 6
median: 9
Q3: 13
95-th percentile: 20
Maximum: 30
Range: 30
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 5.922111
Coefficient of variation (CV): 0.62317295
Kurtosis: 0.33160024
Mean: 9.503158
Median Absolute Deviation (MAD): 4
Skewness: 0.37754388
Sum: 60464983
Variance: 35.071398
Monotonicity: Increasing
Histogram with fixed size bins (bins=31)

Common values

Value | Count | Frequency (%)
0 | 571039 | 9.0%
1 | 452761 | 7.1%
7 | 449147 | 7.1%
5 | 440626 | 6.9%
12 | 429335 | 6.7%
16 | 421098 | 6.6%
6 | 420282 | 6.6%
8 | 418103 | 6.6%
10 | 418006 | 6.6%
14 | 400706 | 6.3%
Other values (21) | 1941517 | 30.5%

Minimum 10 values

Value | Count | Frequency (%)
0 | 571039 | 9.0%
1 | 452761 | 7.1%
2 | 6749 | 0.1%
3 | 21904 | 0.3%
4 | 12995 | 0.2%
5 | 440626 | 6.9%
6 | 420282 | 6.6%
7 | 449147 | 7.1%
8 | 418103 | 6.6%
9 | 392886 | 6.2%

Maximum 10 values

Value | Count | Frequency (%)
30 | 282 | < 0.1%
29 | 11283 | 0.2%
28 | 55037 | 0.9%
27 | 14522 | 0.2%
26 | 8574 | 0.1%
25 | 13893 | 0.2%
24 | 58712 | 0.9%
23 | 33349 | 0.5%
22 | 50432 | 0.8%
21 | 52510 | 0.8%

hour
Real number (ℝ)

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.321454
Minimum: 0
Maximum: 23
Zeros: 71587
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 48.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 9
Q1: 12
median: 16
Q3: 19
95-th percentile: 22
Maximum: 23
Range: 23
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 4.321799
Coefficient of variation (CV): 0.28207499
Kurtosis: 0.68407032
Mean: 15.321454
Median Absolute Deviation (MAD): 3
Skewness: -0.60564418
Sum: 97484591
Variance: 18.677947
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=24)

Common values

Value | Count | Frequency (%)
19 | 647814 | 10.2%
18 | 580509 | 9.1%
20 | 553728 | 8.7%
12 | 483418 | 7.6%
13 | 468474 | 7.4%
11 | 445992 | 7.0%
16 | 441612 | 6.9%
17 | 439941 | 6.9%
14 | 439653 | 6.9%
10 | 425729 | 6.7%
Other values (14) | 1435750 | 22.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 71587 | 1.1%
1 | 27111 | 0.4%
2 | 9018 | 0.1%
3 | 2007 | < 0.1%
4 | 1241 | < 0.1%
5 | 1641 | < 0.1%
6 | 3420 | 0.1%
7 | 8988 | 0.1%
8 | 26915 | 0.4%
9 | 283518 | 4.5%

Maximum 10 values

Value | Count | Frequency (%)
23 | 141257 | 2.2%
22 | 194555 | 3.1%
21 | 247806 | 3.9%
20 | 553728 | 8.7%
19 | 647814 | 10.2%
18 | 580509 | 9.1%
17 | 439941 | 6.9%
16 | 441612 | 6.9%
15 | 416686 | 6.5%
14 | 439653 | 6.9%

CASH_IN
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

Value | Count
0 | 4963336
1 | 1399284

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 4963336 | 78.0%
1 | 1399284 | 22.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 4963336 | 78.0%
1 | 1399284 | 22.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 4963336 | 78.0%
1 | 1399284 | 22.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6362620 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4963336 | 78.0%
1 | 1399284 | 22.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6362620 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 4963336 | 78.0%
1 | 1399284 | 22.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6362620 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 4963336 | 78.0%
1 | 1399284 | 22.0%

CASH_OUT
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

Value | Count
0 | 4125120
1 | 2237500

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 4125120 | 64.8%
1 | 2237500 | 35.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 4125120 | 64.8%
1 | 2237500 | 35.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 4125120 | 64.8%
1 | 2237500 | 35.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6362620 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4125120 | 64.8%
1 | 2237500 | 35.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6362620 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 4125120 | 64.8%
1 | 2237500 | 35.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6362620 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 4125120 | 64.8%
1 | 2237500 | 35.2%

DEBIT
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

Value | Count
0 | 6321188
1 | 41432

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 6321188 | 99.3%
1 | 41432 | 0.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 6321188 | 99.3%
1 | 41432 | 0.7%

Most occurring characters

Value | Count | Frequency (%)
0 | 6321188 | 99.3%
1 | 41432 | 0.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6362620 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 6321188 | 99.3%
1 | 41432 | 0.7%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6362620 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 6321188 | 99.3%
1 | 41432 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6362620 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 6321188 | 99.3%
1 | 41432 | 0.7%

PAYMENT
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

Value | Count
0 | 4211125
1 | 2151495

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 4211125 | 66.2%
1 | 2151495 | 33.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 4211125 | 66.2%
1 | 2151495 | 33.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 4211125 | 66.2%
1 | 2151495 | 33.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6362620 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 4211125 | 66.2%
1 | 2151495 | 33.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6362620 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 4211125 | 66.2%
1 | 2151495 | 33.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6362620 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 4211125 | 66.2%
1 | 2151495 | 33.8%

TRANSFER
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 48.5 MiB

Value | Count
0 | 5829711
1 | 532909

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 6362620
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 5829711 | 91.6%
1 | 532909 | 8.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
0 | 5829711 | 91.6%
1 | 532909 | 8.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 5829711 | 91.6%
1 | 532909 | 8.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 6362620 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 5829711 | 91.6%
1 | 532909 | 8.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6362620 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 5829711 | 91.6%
1 | 532909 | 8.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6362620 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 5829711 | 91.6%
1 | 532909 | 8.4%

Interactions

[Figures not reproduced: 49 pairwise interaction plots, consistent with a 7 × 7 grid over the numeric variables.]

Correlations

Variable | amount | oldbalanceOrig | newbalanceOrig | oldbalanceDest | newbalanceDest | day | hour | isFraud | isFlaggedFraud | CASH_IN | CASH_OUT | DEBIT | PAYMENT | TRANSFER
amount | 1.000 | 0.048 | -0.071 | 0.595 | 0.670 | 0.006 | -0.078 | 0.049 | 0.014 | 0.016 | 0.021 | 0.002 | 0.022 | 0.100
oldbalanceOrig | 0.048 | 1.000 | 0.803 | 0.024 | -0.008 | -0.006 | -0.002 | 0.031 | 0.003 | 0.425 | 0.166 | 0.018 | 0.162 | 0.068
newbalanceOrig | -0.071 | 0.803 | 1.000 | 0.044 | -0.094 | -0.011 | 0.011 | 0.019 | 0.005 | 0.475 | 0.186 | 0.020 | 0.180 | 0.076
oldbalanceDest | 0.595 | 0.024 | 0.044 | 1.000 | 0.936 | -0.004 | -0.015 | 0.002 | 0.000 | 0.003 | 0.002 | 0.002 | 0.022 | 0.029
newbalanceDest | 0.670 | -0.008 | -0.094 | 0.936 | 1.000 | -0.004 | -0.025 | 0.002 | 0.000 | 0.002 | 0.004 | 0.001 | 0.025 | 0.052
day | 0.006 | -0.006 | -0.011 | -0.004 | -0.004 | 1.000 | 0.036 | 0.062 | 0.006 | 0.007 | 0.017 | 0.008 | 0.010 | 0.010
hour | -0.078 | -0.002 | 0.011 | -0.015 | -0.025 | 0.036 | 1.000 | 0.164 | 0.006 | 0.027 | 0.096 | 0.043 | 0.112 | 0.016
isFraud | 0.049 | 0.031 | 0.019 | 0.002 | 0.002 | 0.062 | 0.164 | 1.000 | 0.043 | 0.019 | 0.011 | 0.003 | 0.026 | 0.054
isFlaggedFraud | 0.014 | 0.003 | 0.005 | 0.000 | 0.000 | 0.006 | 0.006 | 0.043 | 1.000 | 0.001 | 0.001 | 0.000 | 0.001 | 0.005
CASH_IN | 0.016 | 0.425 | 0.475 | 0.003 | 0.002 | 0.007 | 0.027 | 0.019 | 0.001 | 1.000 | 0.391 | 0.043 | 0.380 | 0.161
CASH_OUT | 0.021 | 0.166 | 0.186 | 0.002 | 0.004 | 0.017 | 0.096 | 0.011 | 0.001 | 0.391 | 1.000 | 0.060 | 0.526 | 0.223
DEBIT | 0.002 | 0.018 | 0.020 | 0.002 | 0.001 | 0.008 | 0.043 | 0.003 | 0.000 | 0.043 | 0.060 | 1.000 | 0.058 | 0.024
PAYMENT | 0.022 | 0.162 | 0.180 | 0.022 | 0.025 | 0.010 | 0.112 | 0.026 | 0.001 | 0.380 | 0.526 | 0.058 | 1.000 | 0.216
TRANSFER | 0.100 | 0.068 | 0.076 | 0.029 | 0.052 | 0.010 | 0.016 | 0.054 | 0.005 | 0.161 | 0.223 | 0.024 | 0.216 | 1.000
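
The CASH_OUT/PAYMENT entry of 0.526 is what one would expect if the five transaction-type columns are mutually exclusive indicator columns, which their counts suggest (1399284 + 2237500 + 41432 + 2151495 + 532909 = 6362620, the number of observations). The sketch below assumes the categorical entries of the matrix are Cramér's V without bias correction, which for a 2 × 2 table equals the absolute phi coefficient. That choice of measure is inferred from the figures rather than stated in the report, and `cramers_v_exclusive` is a hypothetical helper.

```python
import math


def cramers_v_exclusive(n_a: int, n_b: int, n_total: int) -> float:
    """Cramér's V for two 0/1 columns that are never 1 on the same row.

    For a 2x2 contingency table, V equals |phi| = sqrt(chi2 / n).
    """
    # Cell counts: (a=1, b=1), (a=1, b=0), (a=0, b=1), (a=0, b=0).
    n11, n10, n01 = 0, n_a, n_b
    n00 = n_total - n_a - n_b
    row1, row0 = n11 + n10, n01 + n00
    col1, col0 = n11 + n01, n10 + n00
    phi = (n11 * n00 - n10 * n01) / math.sqrt(row1 * row0 * col1 * col0)
    return abs(phi)


N = 6362620  # number of observations
print(round(cramers_v_exclusive(2237500, 2151495, N), 3))  # CASH_OUT vs PAYMENT
print(round(cramers_v_exclusive(1399284, 2237500, N), 3))  # CASH_IN vs CASH_OUT
```

Under these assumptions the two values round to 0.526 and 0.391, matching the matrix. The association therefore reflects the encoding of a single transaction-type field rather than an independent relationship between the two columns.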

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

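First rows

Row | amount | nameOrig | oldbalanceOrig | newbalanceOrig | nameDest | oldbalanceDest | newbalanceDest | isFraud | isFlaggedFraud | day | hour | CASH_IN | CASH_OUT | DEBIT | PAYMENT | TRANSFER
0 | 9839.64 | C1231006815 | 170136.00 | 160296.36 | M1979787155 | 0.0 | 0.00 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
1 | 1864.28 | C1666544295 | 21249.00 | 19384.72 | M2044282225 | 0.0 | 0.00 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
2 | 181.00 | C1305486145 | 181.00 | 0.00 | C553264065 | 0.0 | 0.00 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1
3 | 181.00 | C840083671 | 181.00 | 0.00 | C38997010 | 21182.0 | 0.00 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0
4 | 11668.14 | C2048537720 | 41554.00 | 29885.86 | M1230701703 | 0.0 | 0.00 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
5 | 7817.71 | C90045638 | 53860.00 | 46042.29 | M573487274 | 0.0 | 0.00 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
6 | 7107.77 | C154988899 | 183195.00 | 176087.23 | M408069119 | 0.0 | 0.00 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
7 | 7861.64 | C1912850431 | 176087.23 | 168225.59 | M633326333 | 0.0 | 0.00 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
8 | 4024.36 | C1265012928 | 2671.00 | 0.00 | M1176932104 | 0.0 | 0.00 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0
9 | 5337.77 | C712410124 | 41720.00 | 36382.23 | C195600860 | 41898.0 | 40348.79 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0

Last rows

Row | amount | nameOrig | oldbalanceOrig | newbalanceOrig | nameDest | oldbalanceDest | newbalanceDest | isFraud | isFlaggedFraud | day | hour | CASH_IN | CASH_OUT | DEBIT | PAYMENT | TRANSFER
6362610 | 63416.99 | C778071008 | 63416.99 | 0.0 | C1812552860 | 0.00 | 0.00 | 1 | 0 | 30 | 22 | 0 | 0 | 0 | 0 | 1
6362611 | 63416.99 | C994950684 | 63416.99 | 0.0 | C1662241365 | 276433.18 | 339850.17 | 1 | 0 | 30 | 22 | 0 | 1 | 0 | 0 | 0
6362612 | 1258818.82 | C1531301470 | 1258818.82 | 0.0 | C1470998563 | 0.00 | 0.00 | 1 | 0 | 30 | 23 | 0 | 0 | 0 | 0 | 1
6362613 | 1258818.82 | C1436118706 | 1258818.82 | 0.0 | C1240760502 | 503464.50 | 1762283.33 | 1 | 0 | 30 | 23 | 0 | 1 | 0 | 0 | 0
6362614 | 339682.13 | C2013999242 | 339682.13 | 0.0 | C1850423904 | 0.00 | 0.00 | 1 | 0 | 30 | 23 | 0 | 0 | 0 | 0 | 1
6362615 | 339682.13 | C786484425 | 339682.13 | 0.0 | C776919290 | 0.00 | 339682.13 | 1 | 0 | 30 | 23 | 0 | 1 | 0 | 0 | 0
6362616 | 6311409.28 | C1529008245 | 6311409.28 | 0.0 | C1881841831 | 0.00 | 0.00 | 1 | 0 | 30 | 23 | 0 | 0 | 0 | 0 | 1
6362617 | 6311409.28 | C1162922333 | 6311409.28 | 0.0 | C1365125890 | 68488.84 | 6379898.11 | 1 | 0 | 30 | 23 | 0 | 1 | 0 | 0 | 0
6362618 | 850002.52 | C1685995037 | 850002.52 | 0.0 | C2080388513 | 0.00 | 0.00 | 1 | 0 | 30 | 23 | 0 | 0 | 0 | 0 | 1
6362619 | 850002.52 | C1280323807 | 850002.52 | 0.0 | C873221189 | 6510099.11 | 7360101.63 | 1 | 0 | 30 | 23 | 0 | 1 | 0 | 0 | 0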